Definition Extraction using Linguistic and Structural Features

Author

  • Eline Westerhout
Abstract

In this paper, a combination of linguistic and structural information is used to extract Dutch definitions. The corpus is a collection of Dutch texts on computing and e-learning containing 603 definitions. The extraction process consists of two steps. First, a parser applies a grammar, defined on the basis of the patterns observed in the definitions, to the complete corpus. Machine learning is then applied to improve the results obtained with the grammar. The experiments show that combining linguistic information (n-grams, type of article, type of noun) with structural information (layout, position) is a promising approach to the definition extraction task.
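The two-step process described in the abstract might be sketched roughly as follows. This is only an illustration under stated assumptions: the single "X is een Y" regular expression stands in for the paper's grammar, which the abstract does not give, and the function names, feature names, and example sentences are hypothetical. The classifier itself is left out; the feature dictionaries are simply what a learner of one's choice could consume in the second step.

```python
import re
from typing import Dict, List

# Hypothetical stand-in for the grammar: one Dutch "X is een Y" pattern.
# The actual grammar used in the paper is not given in the abstract.
IS_PATTERN = re.compile(
    r"^(?:(?P<article>de|het|een)\s+)?(?P<term>[\w-]+)\s+is\s+(?P<rest>een\s+.+)$",
    re.IGNORECASE,
)

# Dutch articles: "de" and "het" are definite, "een" is indefinite.
ARTICLE_TYPES = {"de": "definite", "het": "definite", "een": "indefinite"}


def find_candidates(sentences: List[str]) -> List[int]:
    """Step 1: return the indices of sentences that match the pattern."""
    return [i for i, s in enumerate(sentences) if IS_PATTERN.match(s.strip())]


def extract_features(sentences: List[str], index: int) -> Dict[str, object]:
    """Step 2 input: linguistic and structural features for one candidate."""
    sentence = sentences[index].strip()
    match = IS_PATTERN.match(sentence)
    article = (match.group("article") or "").lower() if match else ""
    tokens = sentence.lower().rstrip(".").split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return {
        "article_type": ARTICLE_TYPES.get(article, "none"),  # linguistic
        "bigrams": bigrams,                                   # linguistic (n-grams)
        "position": index,                                    # structural
        "is_first_sentence": index == 0,                      # structural
    }


# Made-up example paragraph (not taken from the paper's corpus).
paragraph = [
    "Een browser is een programma om webpagina's te bekijken.",
    "Klik op de knop om verder te gaan.",
    "HTML is een opmaaktaal.",
]
candidates = find_candidates(paragraph)
features = [extract_features(paragraph, i) for i in candidates]
```

In a fuller pipeline, the candidates would be labelled as definitions or non-definitions and the feature dictionaries passed to a classifier, so that the learner can filter out false positives produced by the pattern step.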

Similar resources

Towards Definition Extraction Using Conditional Random Fields

Definition Extraction (DE) and terminology help structure the overwhelming amount of information available. This article presents KESSI (Knowledge Extraction System for Scientific Interviews), a multilingual, domain-independent machine-learning approach to the extraction of definitional knowledge, specifically oriented to scientific interviews. The DE task was approached as ...

Institute of Phonetic Sciences,

The purpose of this study was to explore the notion of prominence in spoken language. It concentrated on finding an operational definition of prominence, on giving a description of the linguistic and acoustical correlates of prominence, and on analyzing these correlates in terms of their contribution to prominence distinctions. Furthermore, this study was concerned with feature extraction, and ...

Extraction of Drug-Drug Interaction from Literature through Detecting Linguistic-based Negation and Clause Dependency

Extracting biomedical relations such as drug-drug interaction (DDI) from text is an important task in biomedical NLP. Because biomedical literature contains many complex sentences, researchers have employed sentence simplification techniques to improve the performance of relation extraction methods. However, due to the difficulty of the task, there is no noteworthy improvement in t...

Comparing the word definition skill between children with cochlear implant and normal children.

Background: Word definition is a linguistic and metalinguistic skill related to the development of language, academic success, and intellectual function. There is little research in the field of word definition in children with cochlear implants (CI) in Iran. Therefore, the purpose of this study was to examine and compare word definition in children with CI and normal children. Methods: In...

Publication date: 2010